Lecture 3

Temporal Models and Smoothing

Sara Martino

Dept. of Mathematical Science, NTNU

Janine Illian

University of Glasgow

Jafet Belmont

University of Glasgow

Some things were missing yesterday …

Some inlabru shortcuts

In the slides we have fitted the model as:

cmp = ~ -1 + beta0(1) + beta1(covariate, model = "linear")
formula = y ~ beta0 + beta1
lik = bru_obs(formula = formula,
              family = "gaussian",
              data = df)
fit = bru(cmp, lik)

but in the practical you have

cmp = ~ -1 + beta0(1) + beta1(covariate, model = "linear")

lik = bru_obs(formula = y ~ .,
              family = "gaussian",
              data = df)
fit = bru(cmp, lik)
  1. You don’t have to define a formula outside of the bru_obs() function: it can also be defined inside the function!
  2. The expression y ~ . just means “take all the components and sum them together”. It also tells the bru() function that your predictor is linear.

predict() and generate() functions

In the practicals you have used the predict() function to get results from the fitted model.

  • predict() simulates from the fitted posterior distribution \(\pi(\mathbf{u},\theta|\mathbf{y})\), computes what you ask and produces a summary of the samples (mean, sd, quantiles, etc.)

  • generate() simulates from the fitted posterior distribution \(\pi(\mathbf{u},\theta|\mathbf{y})\), computes what you ask and returns the samples!
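The difference between the two can be sketched in a few lines of Python (illustrative only, with made-up sample values — this is the idea, not the inlabru implementation): given posterior samples of some quantity, generate() hands you the samples themselves, while predict() returns a summary of them.

```python
import statistics

# five hypothetical posterior samples of some quantity of interest
samples = [0.17, 0.21, 0.15, 0.19, 0.18]

# generate()-style output: the samples themselves
generated = samples

# predict()-style output: a summary of the same samples
summary = {"mean": statistics.mean(samples),
           "sd": statistics.stdev(samples)}
```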

Example

Let’s fit the model for the Tokyo rainfall data \[ \begin{aligned} y_t|\eta_t&\sim\text{Bin}(n_t, p_t),\qquad t = 1,\dots,366\\ \eta_t &= \text{logit}(p_t)= \beta_0 + f(\text{time}_t) \end{aligned} \]

data("Tokyo")
cmp = ~ -1 + Intercept(1) + time(time, model = "rw2")
formula = y ~ Intercept + time
lik = bru_obs(formula = formula,
              data = Tokyo,
              Ntrials = n,
              family = "binomial")
fit = bru(cmp, lik)

We now want to extract results…what do we want?

  • The time effect \(f(\text{time}_t)\) ?
  • The linear predictor \(\eta_t = \beta_0 + f(\text{time}_t)\)?
  • The estimated probability of precipitation \(p_t = \text{inv_logit}(\eta_t)\) ?

We can get all of them with the predict() or generate() functions!

Example - predict()

preds1 = predict(object = fit, newdata = Tokyo, ~ time)
preds2 = predict(object = fit, newdata = Tokyo, ~ Intercept + time)
inv_logit = function(x) ((1 + exp(-x))^(-1))
preds3 = predict(object = fit, newdata = Tokyo, ~ inv_logit(Intercept + time))

or

inv_logit = function(x) ((1 + exp(-x))^(-1))
preds = predict(object = fit, newdata = Tokyo, 
                ~ data.frame(time_eff = time,
                             lin_pred = Intercept + time,
                             probs = inv_logit(Intercept + time)),
                n.samples = 1000
                )
# preds is then a list                
round(preds$probs[1:3,],3)
  y n time  mean    sd q0.025  q0.5 q0.975 median sd.mc_std_err mean.mc_std_err
1 0 2    1 0.174 0.087  0.057 0.156  0.388  0.156         0.003           0.003
2 0 2    2 0.172 0.082  0.059 0.156  0.369  0.156         0.003           0.003
3 1 2    3 0.169 0.077  0.062 0.157  0.347  0.157         0.002           0.003

Example - predict()

preds$probs %>% ggplot() + geom_line(aes(time, mean)) +
  geom_ribbon(aes(time, ymin = q0.025, ymax = q0.975), alpha = 0.5)

Example - generate()

samples = generate(object = fit, newdata = Tokyo, 
                ~ data.frame(time_eff = time,
                             lin_pred = Intercept + time,
                             probs = inv_logit(Intercept + time)),
                n.samples = 20
                )
# samples is now a list of length 20 (n.samples); each element of the list looks like:

samples[[1]][1:3,]
    time_eff  lin_pred     probs
1 -0.3865601 -1.560139 0.1736267
2 -0.4737201 -1.647299 0.1614743
3 -0.5418052 -1.715384 0.1524666

Example - generate()

data.frame(time = Tokyo$time, sapply(samples, function(x) x[,1])) %>%
  pivot_longer(-time) %>%
  ggplot() + geom_line(aes(time, value, group = name, color = factor(name))) +
  theme(legend.position = "none")

Effects of categorical variables and Interactions

Both of these require a little more thinking in inlabru than in lm() or similar tools.

Let’s look at the mtcars dataset that is in R

                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Effects of categorical variables

We want to fit a model where gear is the only covariate.

gear has three categories: 3, 4 and 5

m1 = lm(mpg~ gear, data = df)
m1$coef
(Intercept)       gear4       gear5 
  16.106667    8.426667    5.273333 

How do we do this in inlabru?

  • Option 1: Use the model.matrix() function in R
# create the model matrix
cov_effect = model.matrix(~ gear, data = df)
cov_effect[1:2,]
              (Intercept) gear4 gear5
Mazda RX4               1     1     0
Mazda RX4 Wag           1     1     0
# attach this to your data
df = cbind(df, cov_effect[,-1])

cmp1 = ~ Intercept(1) + gear_4(gear4, model = "linear") +
  gear_5(gear5, model= "linear")
lik1 = bru_obs(formula = mpg ~.,
               data = df)
fit1 = bru(cmp1, lik1)
fit1$summary.fixed$mean
[1] 16.103112  8.414875  5.253896

Effects of categorical variables

We want to fit a model where gear is the only covariate.

gear has three categories: 3, 4 and 5

m1 = lm(mpg~ gear, data = df)
m1$coefficients
(Intercept)       gear4       gear5 
  16.106667    8.426667    5.273333 

How do we do this in inlabru?

  • Option 2: fixed effects are just random effects with fixed precision
df$gear_id = as.numeric(df$gear) # inlabru needs values like 1,2,3 for random effects

cmp2 = ~ -1 +  gear_effect(gear_id, model = "iid", fixed = T,initial = -6)
# or: cmp2 = ~ Intercept(1) +  gear_effect(gear_id, model = "iid", fixed = T,initial = -6, constr = T)
lik2 = bru_obs(formula = mpg ~.,
               data = df)
fit2 = bru(cmp2, lik2)
fit2$summary.random$gear_effect$mean
[1] 16.04861 24.42290 21.15058

What is happening here??

This is just a re-parametrization problem

c(m1$coef[1], m1$coef[1] + m1$coef[2],m1$coef[1] + m1$coef[3])
(Intercept) (Intercept) (Intercept) 
   16.10667    24.53333    21.38000 
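Numerically, the two parametrizations carry the same information: adding the reference intercept to each treatment contrast recovers the per-group means that the iid formulation estimates. A minimal Python check (using the lm() estimates from the slide; Python here purely as an illustration):

```python
# lm() treatment-contrast estimates from the slide
intercept = 16.106667
contrasts = {"gear4": 8.426667, "gear5": 5.273333}

# one-mean-per-level parametrization (what the iid component estimates)
group_means = {"gear3": intercept}
for level, c in contrasts.items():
    group_means[level] = intercept + c
```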

Interaction between two categorical variables

Say we want to fit this model

df = mtcars %>% mutate(gear = factor(gear), vs = factor(vs))
m2 = lm(mpg ~ gear * vs, data = df)
(Intercept)       gear4       gear5         vs1   gear4:vs1   gear5:vs1 
  15.050000    5.950000    4.075000    5.283333   -1.043333    5.991667 

Here the best solution is to use the model.matrix() again

# create the model matrix
cov_effect = model.matrix(~ gear * vs, data = df)
cov_effect[1:1,]
(Intercept)       gear4       gear5         vs1   gear4:vs1   gear5:vs1 
          1           1           0           0           0           0 
df = cbind(df, cov_effect[,-1])       # attach this to your data

cmp3 = ~ Intercept(1) + gear_4(gear4, model = "linear") +
  gear_5(gear5, model = "linear") + vs_1(vs1, model = "linear") +
  gear4_vs1(`gear4:vs1`, model = "linear") + gear5_vs1(`gear5:vs1`, model = "linear") 

lik3 = bru_obs(formula= mpg ~ ., data = df)
fit3 = bru(cmp3, lik3)
fit3$summary.fixed$mean
[1] 15.0434617  5.8986234  4.0890402  5.2876293 -0.9880579  5.8809788

Interaction between a categorical and a continuous variable

df = mtcars %>% mutate(gear = factor(gear))
m2 = lm(mpg ~ gear * disp, data = df)
(Intercept)       gear4       gear5        disp  gear4:disp  gear5:disp 
24.51556635 15.05196338  7.14538044 -0.02577046 -0.09644222 -0.02500467 

Option 1: Again we can use the model.matrix() option:

# create the model matrix
cov_effect = model.matrix(~ gear * disp, data = df)
cov_effect[1:1,]
(Intercept)       gear4       gear5        disp  gear4:disp  gear5:disp 
          1           1           0         160         160           0 
df = cbind(df, cov_effect[,-1])       # attach this to your data

cmp4 = ~ Intercept(1) + gear_4(gear4, model = "linear") +
  gear_5(gear5, model = "linear") + disp(disp, model = "linear") +
  gear4_disp(`gear4:disp`, model = "linear") + gear5_disp(`gear5:disp`, model = "linear") 

lik4 = bru_obs(formula= mpg ~ ., data = df)
fit4 = bru(cmp4, lik4)
fit4$summary.fixed$mean
[1] 24.50105803 14.96909092  7.11470263 -0.02572924 -0.09575832 -0.02486880
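The key point behind Option 1 is that an interaction column in the model matrix is nothing but the elementwise product of the dummy column and the continuous covariate. A tiny Python illustration (made-up values, shown only to make the construction explicit):

```python
# dummy indicator for gear == 4 and the continuous covariate disp
gear4 = [1, 1, 0, 0]
disp = [160.0, 160.0, 108.0, 258.0]

# the gear4:disp column of model.matrix(~ gear * disp)
gear4_disp = [g * d for g, d in zip(gear4, disp)]
```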

Interaction between a categorical and a continuous variable

df = mtcars %>% mutate(gear = factor(gear))
m2 = lm(mpg ~ gear * disp, data = df)
(Intercept)       gear4       gear5        disp  gear4:disp  gear5:disp 
24.51556635 15.05196338  7.14538044 -0.02577046 -0.09644222 -0.02500467 

Option 2: fixed effects are random effects with fixed precision!

df$gear_id = as.numeric(df$gear) # inlabru needs values like 1,2,3 for random effects

cmp5 = ~ -1 +  gear_effect_int(gear_id, model = "iid", fixed = T,initial = -6) +
  gear_effect_slope(gear_id, disp, model = "iid", fixed = T,initial = -6)
lik5 = bru_obs(formula = mpg ~.,
               data = df)
fit5 = bru(cmp5, lik5)
fit5$summary.random$gear_effect_int$mean
[1] 24.15667 38.93827 31.16922
fit5$summary.random$gear_effect_slope$mean
[1] -0.02475096 -0.11752709 -0.04884811

Again..the differences are a matter of parametrization!

Interaction between a categorical and a continuous variable

pred_df = data.frame(disp = rep(seq(70, 473), 3),
                     gear = factor(rep(3:5, each = length(seq(70, 473))))) %>%
  mutate(gear_id = as.numeric(gear))

lm_pred = cbind(pred_df, pred = predict(m2, newdata = pred_df))

inla_pred = predict(fit5, pred_df, ~ gear_effect_int + gear_effect_slope)

Temporal models and smoothing

Motivation

Data are often observed in time, and time dependence is often expected.

  • Observations are correlated in time

Motivation

  1. Smoothing of the time effect

What is our goal?

  1. Smoothing of the time effect

Note: We can use the same model to smooth covariate effects!

What is our goal?

  1. Smoothing of the time effect

  2. Prediction

We can “predict” any unobserved data, not only future data, e.g. gaps in the data etc.

Modelling time with INLA

Time can be indexed over a

  • Discrete domain (e.g., years)

  • Continuous domain

Modelling time with INLA

Time can be indexed over a

  • Discrete domain (e.g., years)

    • Main models: RW1, RW2 and AR1

    • Note: RW1 and RW2 are also used for smoothing covariates

  • Continuous domain

    • Here we use the so-called SPDE-approach (more on this later)

Discrete time modelling

Example - Height of Lake Erie in time

Goal: we want to understand the pattern and predict into the future

Random Walk models

Random walk models encourage the mean of the linear predictor to vary gradually over time.

They do this by assuming that, on average, the time effect at each point is the mean of the effect at the neighbouring points.

  • Random Walk of order 1 (RW1) we take the two nearest neighbours

  • Random Walk of order 2 (RW2) we take the four nearest neighbours

Random walks of order 1

Idea: \(\longrightarrow\ u_t = \text{mean}(u_{t-1} , u_{t+1}) + \text{Gaussian error with precision } \tau\)

Definition

\[ \pi(\mathbf{u} \mid \tau) \propto \exp\!\left( -\frac{\tau}{2} \sum_{t=1}^{T-1} (u_{t+1} - u_t)^2 \right) = \exp\!\left(-\tfrac{1}{2} \, \mathbf{u}^{\top} \mathbf{Q}\ \mathbf{u}\right) \]

\[ \mathbf{Q} = \tau \begin{bmatrix} 1 & -1 & & & & \\ -1 & 2 & -1 & & & \\ & & \ddots & \ddots & \ddots & \\ & & & -1 & 2 & -1 \\ & & & & -1 & 1 \end{bmatrix} \]
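The structure of \(\mathbf{Q}\) follows directly from the sum of squared first differences: writing the density as \(\exp(-\tfrac{\tau}{2}\|\mathbf{D}\mathbf{u}\|^2)\), with \(\mathbf{D}\) the first-difference matrix, gives \(\mathbf{Q}=\tau\mathbf{D}^\top\mathbf{D}\). A small Python sketch of the (unscaled) structure matrix, just to make the construction concrete:

```python
def rw1_structure(T):
    # unscaled RW1 precision: Q/tau = D^T D, where D is the
    # (T-1) x T first-difference matrix
    D = [[0] * T for _ in range(T - 1)]
    for t in range(T - 1):
        D[t][t], D[t][t + 1] = -1, 1
    return [[sum(D[k][i] * D[k][j] for k in range(T - 1))
             for j in range(T)] for i in range(T)]
```

For T = 4 this reproduces exactly the tridiagonal pattern shown above.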

Random walks of order 1

Idea: \(\longrightarrow\ u_t = \text{mean}(u_{t-1} , u_{t+1}) + \text{Gaussian error with precision } \tau\)

Definition

\[ \pi(\mathbf{u} \mid \tau) \propto \exp\!\left( -\frac{\tau}{2} \sum_{t=1}^{T-1} (u_{t+1} - u_t)^2 \right) = \exp\!\left(-\tfrac{1}{2} \, \mathbf{u}^{\top} \mathbf{Q}\ \mathbf{u}\right) \]

  1. Role of the precision parameter \(\tau\) and prior distribution
  2. RW as intrinsic model

What is the role of the precision parameter?

  • \(\tau\) says how much \(u_t\) can vary around its mean

    • Small \(\tau\) \(\rightarrow\) large variation \(\rightarrow\) less smooth effect
    • Large \(\tau\) \(\rightarrow\) small variation \(\rightarrow\) smoother effect

We need to set a prior distribution for \(\tau\).

A common option is the so called PC-priors

Penalized Complexity (PC) priors

  • PC priors are easily available in inlabru for many model parameters
  • They are built with two principles in mind:

    1. The prior discourages overfitting by penalizing deviation from a base model

  • A line is the base model

  • We want to penalize more complex models

Penalized Complexity (PC) priors

  • PC priors are easily available in inlabru for many model parameters

  • They are built with two principles in mind:

    1. The prior discourages overfitting by penalizing deviation from a base model
    2. User-defined scaling

\[ \begin{eqnarray} \sigma = \sqrt{1/\tau} \\ \text{Prob}(\sigma>U) = \alpha;\\ \qquad U>0, \ \alpha \in (0,1) \end{eqnarray} \]

  • \(U\) an upper limit for the standard deviation and \(\alpha\) a small probability.

  • \(U\) a likely value for the standard deviation and \(\alpha=0.5\).
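Concretely, the pc.prec specification places an exponential prior on \(\sigma\) with rate \(\lambda=-\ln(\alpha)/U\), so that \(\text{Prob}(\sigma>U)=e^{-\lambda U}=\alpha\). A quick numeric check (Python here purely as a calculator):

```python
import math

U, alpha = 1.0, 0.01          # e.g. Prob(sigma > 1) = 0.01
lam = -math.log(alpha) / U    # rate of the exponential prior on sigma
prob_exceed = math.exp(-lam * U)  # survival function at U
```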

Example

The Model

\[ \begin{aligned} y_i|\eta_i, \sigma^2 & \sim \mathcal{N}(\eta_i,\sigma^2)\\ \eta_i & = \beta_0 + f(t_i)\\ f(t_1),f(t_2),\dots,f(t_n) &\sim \text{RW1}(\tau) \end{aligned} \]

The code

cmp = ~ Intercept(1) +
  time(year, model = "rw1",
       hyper = list(prec =
                      list(prior = "pc.prec",
                           param = c(0.5, 0.5))))

RW as intrinsic models

RW1 defines differences, not absolute levels:

  • Only the changes between neighbouring terms are modelled.

  • Mathematically, \[ (u_1,\dots,u_n)\text{ and }(u_1+a,\dots,u_n+a) \] produce identical likelihoods — they’re indistinguishable.

This means:

  • If \(u_t\sim\text{RW}1\) then \[ \begin{aligned} \eta_t & = \beta_0 + u_t \\ & =(\beta_0+k) + (u_t-k) \\ & = \beta_0^* + u_t^* \end{aligned} \]

so parameters are not well-defined.

Solution:

  • Sum to zero constraint \(\sum_{i = 1}^n u_i = 0\)
  • This is included in the model by default
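The identifiability problem is easy to see numerically: the first differences, which are all the RW1 density looks at, are unchanged when a constant is added to every \(u_t\). A small Python check (values chosen to be exactly representable in floating point):

```python
u = [0.5, -0.25, 0.75]
shifted = [x + 2.0 for x in u]  # add a constant to every term

def diffs(v):
    # the increments u_{t+1} - u_t are all the RW1 density sees
    return [v[i + 1] - v[i] for i in range(len(v) - 1)]
```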

RW as intrinsic models

cmp1 = ~ Intercept(1) + time(year, model = "rw1", constr = TRUE)
cmp2 = ~ Intercept(1) + time(year, model = "rw1", constr = FALSE)
lik = bru_obs(formula = Erie~.,
              data = lakes)

fit1 = bru(cmp1,lik)
fit2 = bru(cmp2,lik)
[1] "FIT1 - Intercept"
             mean    sd 0.025quant 0.5quant 0.975quant    mode kld
Intercept 174.138 0.001    174.135  174.138    174.141 174.138   0
[1] "FIT2 - Intercept"
          mean     sd 0.025quant 0.5quant 0.975quant mode kld
Intercept    0 31.623    -62.009        0     62.009    0   0
[1] "FIT1 - RW1 effect"
    ID   mean    sd 0.025quant 0.5quant 0.975quant   mode kld
1 1918 -0.122 0.014     -0.150   -0.122     -0.094 -0.122   0
2 1919  0.040 0.014      0.011    0.040      0.067  0.040   0
[1] "FIT2 - RW1 effect"
    ID    mean     sd 0.025quant 0.5quant 0.975quant    mode kld
1 1918 174.017 31.623    112.008  174.017    236.025 174.017   0
2 1919 174.177 31.623    112.168  174.177    236.185 174.177   0

Random walks of order 2

  • Just like RW1, but now we consider 4 neighbours instead of 2

\[ u_t = \text{mean}(u_{t-2} ,u_{t-1} , u_{t+1}, u_{t+2} ) + \text{some Gaussian error with precision } \tau \]

  • RW2 are smoother than RW1

  • The precision has the same role as for RW1
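Formally, parallel to the RW1 definition above, the RW2 density penalizes squared second differences:

\[ \pi(\mathbf{u} \mid \tau) \propto \exp\!\left( -\frac{\tau}{2} \sum_{t=2}^{T-1} (u_{t+1} - 2u_t + u_{t-1})^2 \right) \]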

Example

cmp1 = ~ Intercept(1) +
  time(year, model = "rw1",
       scale.model = T,
       hyper = list(prec =
                      list(prior = "pc.prec",
                           param = c(0.3,0.5))))

cmp2 = ~ Intercept(1) +
  time(year, model = "rw2",
       scale.model = T,
       hyper = list(prec =
                      list(prior = "pc.prec",
                           param = c(0.3,0.5))))


lik = bru_obs(formula = Erie~ .,
              data = lakes)

fit1 = bru(cmp1, lik)
fit2 = bru(cmp2, lik)

NOTE: the scale.model = TRUE option scales the \(\mathbf{Q}\) matrix so the precision parameter has the same interpretation in both models.

RW models as smoothers for covariates

  • RW models are discrete models

  • Covariates are often recorded as continuous values

  • The function inla.group() will bin covariate values into groups (default 25 groups)

inla.group(x, n = 25, method = c("cut", "quantile"))

  • Two ways to bin

    • cut (default) splits the data using equal length intervals
    • quantile uses equi-distant quantiles in the probability space.
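A sketch of the "cut"-style behaviour (an illustration of the idea only, not the exact inla.group() implementation): split the covariate range into n equal-length bins and replace each value by a representative value of its bin.

```python
def group_values(x, n=25):
    # split the range of x into n equal-length intervals ("cut"-style)
    # and replace every value by the midpoint of its interval
    lo, hi = min(x), max(x)
    width = (hi - lo) / n
    out = []
    for v in x:
        k = min(int((v - lo) / width), n - 1)  # bin index, clamped
        out.append(lo + (k + 0.5) * width)
    return out
```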

RW models as smoothers for covariates - Example

The data are derived from an anthropometric study of 892 females under 50 years in three Gambian villages in West Africa.

Age - Age of respondent (continuous)

triceps - Triceps skinfold thickness.

triceps$age_group = inla.group(triceps$age, n = 30)

Model fit and results

cmp = ~ Intercept(1) + cov(age_group, model = "rw2", scale.model =T)
lik = bru_obs(formula = triceps ~.,
              data = triceps)

fit = bru(cmp, lik)
pred = predict(fit, triceps, ~ Intercept + cov)

Summary RW (1 and 2) models

  • Latent effects suitable for smoothing and modelling temporal data.

  • One hyperparameter: the precision \(\tau\)

    • Use PC prior for \(\tau\)
  • It is an intrinsic model

    • The precision matrix \(\mathbf{Q}\) is rank deficient

    • A sum-to-zero constraint is added to make the model identifiable!

  • RW2 models are smoother than RW1

Auto Regressive Models of order 1 (AR1)

Definition

\[ u_t = \phi u_{t-1} + \epsilon_t; \qquad \phi\in(-1,1), \ \epsilon_t\sim\mathcal{N}(0,\tau^{-1}) \]

\[ \pi(\mathbf{u}|\tau)\propto\exp\left(-\frac{\tau}{2}\mathbf{u}^T\mathbf{Q}\mathbf{u}\right) \]

with

\[ \mathbf{Q} = \begin{bmatrix} 1 & -\phi & & & & \\ -\phi & (1+\phi^2) & -\phi & & & \\ & & \ddots & \ddots & \ddots & \\ & & & -\phi & (1+\phi^2) & -\phi \\ & & & & -\phi & 1 \end{bmatrix} \]
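You can verify that this tridiagonal matrix really is the inverse of the stationary AR1 correlation matrix \(\Sigma_{ij}=\phi^{|i-j|}\), up to the scaling \(1/(1-\phi^2)\). A short Python check (plain lists, no libraries):

```python
phi, T = 0.5, 4

# stationary AR1 correlation matrix: Sigma_ij = phi^|i-j|
Sigma = [[phi ** abs(i - j) for j in range(T)] for i in range(T)]

# the tridiagonal matrix from the slide
Q = [[0.0] * T for _ in range(T)]
for i in range(T):
    Q[i][i] = 1 + phi ** 2 if 0 < i < T - 1 else 1.0
    if i + 1 < T:
        Q[i][i + 1] = Q[i + 1][i] = -phi

# scaled product Q Sigma / (1 - phi^2) should be the identity
s = 1 / (1 - phi ** 2)
prod = [[s * sum(Q[i][k] * Sigma[k][j] for k in range(T))
         for j in range(T)] for i in range(T)]
```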

AR1: Hyperparameters and prior

The AR1 model has two parameters

  • The precision \(\tau\)
  • The autocorrelation (or persistence) parameter \(\phi\in(-1,1)\)

AR1: Hyperparameters and prior

The AR1 model has two parameters

  • The precision \(\tau\)

    • PC prior as before - base model \(\sigma=0\) pc.prec \[ \text{Prob}(\sigma > u) = \alpha \]
  • The autocorrelation (or persistence) parameter \(\phi\in(-1,1)\)

    • Two choices of PC priors
  1. Baseline \(\phi = 0\) pc.cor0

\[ \begin{eqnarray} \text{Prob}(|\phi| > u) = \alpha;\\ -1<u<1;\ 0<\alpha<1 \end{eqnarray} \]

AR1: Hyperparameters and prior

The AR1 model has two parameters

  • The precision \(\tau\)

    • PC prior as before - base model \(\sigma=0\) pc.prec \[ \text{Prob}(\sigma > u) = \alpha \]
  • The autocorrelation (or persistence) parameter \(\phi\in(-1,1)\)

    • Two choices of PC priors
  2. Baseline \(\phi = 1\) pc.cor1

\[ \begin{eqnarray} \text{Prob}(\phi > u) = \alpha;&\\ -1<u<1;\qquad &\sqrt{\frac{1-u}{2}}<\alpha<1 \end{eqnarray} \]

Example - AR1 and RW1 for earthquakes data

The Model

\[ \begin{aligned} y_t|\eta_t & \sim \text{Poisson}(\exp(\eta_t))\\ \eta_t & = \beta_0 + u_t\\ 1.\ u_t&\sim \text{RW}1(\tau)\\ 2.\ u_t&\sim \text{AR}1(\tau, \phi)\\ \end{aligned} \]

Number of serious earthquakes per year

hyper = list(prec = list(prior = "pc.prec", param = c(1,0.5)))
cmp1 = ~ Intercept(1) + time(year, model = "rw1", scale.model = T,
                             hyper = hyper)
cmp2 = ~ Intercept(1) + time(year, model = "ar1",
                             hyper = hyper)


lik = bru_obs(formula = quakes ~ .,
              family = "poisson",
              data = df)

fit1 = bru(cmp1, lik)
fit2 = bru(cmp2, lik)

Example - RW1 and AR1

Estimated trend

Example - RW1 and AR1

Predictions

AR1 vs. RW models

  • RW1 models can be seen as the limit of AR1 models as \(\phi\rightarrow 1\)

  • They are not too different when smoothing, but they can give quite different predictions.

Continuous time modelling

Continuous time modelling

  • Sometimes the data are not collected at discrete time points but over continuous time

  • One solution (usually not a bad one) is to discretize time and use an RW model (AR models are then harder to justify…)

  • A different solution is to use a continuous smoother based on a continuous Gaussian Random Field (GRF) \(\omega(t)\)

What is this?

  • It is a process defined everywhere in the time domain \(\mathcal{T}\) such that, if I take \(n\) points \(t_1,t_2,\dots,t_n\), then

\[ (\omega(t_1),\omega(t_2),\dots,\omega(t_n))\sim\mathcal{N}(\mathbf{0},\mathbf{\Sigma}_{\omega(t_1),\omega(t_2),\dots,\omega(t_n)} ), \qquad \text{ for any } t_1,t_2,\dots,t_n\in\mathcal{T} \] Where \(\mathbf{\Sigma}_{\omega(t_1),\omega(t_2),\dots,\omega(t_n)}\) is a variance-covariance matrix.

To define \(\omega(t), t\in\mathcal{T}\), we have to define \(\Sigma\)

BUT…

Covariance matrices are not nice objects!

  • Hard to define
  • Dense
  • Computationally hard to deal with

The SPDE approach

We define a (Matérn) GRF as the solution of a stochastic partial differential equation (SPDE)

\[ (\kappa^2-\Delta)^{\alpha/2}\omega(t) = W(t) \]

What is this?

  • \(W(t)\) is Gaussian white noise
  • \(\omega(t)\) is the smooth process we want
  • \((\kappa^2-\Delta)^{\alpha/2}\) is an operator that “smooths” the white noise.
  • \(\kappa\) and \(\alpha\) are parameters

One analogy..

  • Think of white noise as someone randomly tapping along a rope.

    • The taps are independent.
    • Some are big, some are small.
    • They have no relation to each other.
    • This is just like White noise!
  • But the rope has tension and stiffness

    • the tension spreads each tap to nearby points.
    • Sharp jumps are softened.
  • The rope settles into a smooth wiggly shape (just like a Matérn field!)

The parameters:

  • Tension = controls how far randomness propagates (\(\kappa\))

  • Stiffness = controls how smooth the field becomes (\(\alpha\))

SPDE as a smoothing operator

Solving the SPDE

Ok…but we still need to solve the SPDE to find \(\omega(t)\)!

Now we need to discretize the domain into \(T\) points (we cannot compute on a continuum!)

We represent our solution as

\[ \omega(t) = \sum_{i = 1}^T\phi_i(t)w_i \]

Where

  • \(\phi_i(t)\) are (known) basis functions
  • \(w_i\) are (unknown) weights
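With degree-1 ("hat") basis functions on a 1-d mesh, this expansion reduces to linear interpolation of the weights between neighbouring mesh nodes. A minimal Python sketch of the idea (illustrative only, not the fm_mesh_1d implementation):

```python
def hat_weights(t, knots):
    # piecewise-linear ("hat") basis functions on a 1-d mesh:
    # at any t between two knots, only the two neighbouring bases are non-zero
    phi = [0.0] * len(knots)
    for i in range(len(knots) - 1):
        a, b = knots[i], knots[i + 1]
        if a <= t <= b:
            phi[i] = (b - t) / (b - a)
            phi[i + 1] = (t - a) / (b - a)
            break
    return phi

def field(t, knots, w):
    # omega(t) = sum_i phi_i(t) * w_i
    return sum(p * wi for p, wi in zip(hat_weights(t, knots), w))
```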

In summary

  • The continuous Matérn GRF is the solution of an SPDE and is represented as

\[ \omega(t) = \sum_{i = 1}^T\phi_i(t)w_i \]

  • The weights vector \(\mathbf{w} = (w_1,\dots,w_T)\) is Gaussian with a sparse precision matrix \(\longrightarrow\) computational convenience

  • The field has two parameters

    • The range \(\rho\)
    • The marginal variance \(\sigma^2\)
  • These parameters are linked to the parameters of the SPDE

  • We need to assign priors to them
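For reference, under the standard SPDE parametrisation (assuming the usual Matérn link used in the SPDE literature), the smoothness and range relate to the SPDE parameters as

\[ \nu = \alpha - d/2, \qquad \rho = \frac{\sqrt{8\nu}}{\kappa} \]

where \(d\) is the dimension of the domain (here \(d=1\)) and \(\nu\) is the Matérn smoothness.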

PC priors for range and marginal variance

Remember:

  • PC priors penalize distance from a base model
  • They are expressed in a user-friendly way

For the range we have: \[ \text{Prob}(\rho<\rho_0) = p_{\rho} \]

For the marginal standard deviation we have: \[ \text{Prob}(\sigma<\sigma_0) = p_{\sigma} \]

PC priors for range and marginal variance

In practice…

If you want to use a continuous model for time (or for smoothing a covariate) you need to:

  1. Define a 1-dimensional mesh over the domain of interest (allowing for some buffer around your domain)
  • How many nodes?
  • Where to place them?
# domain of interest 0-10
locs = seq(-3,13,1)
mesh1d = fm_mesh_1d(locs, degree = 1, boundary = "neumann")
  2. Define the SPDE model and the prior
spde <- inla.spde2.pcmatern(mesh1d,
  prior.range = c(10, 0.5), # Prob(range<10) = 0.5
  prior.sigma = c(1, 0.5)   # Prob(sd>1) = 0.5
)
  3. Use the SPDE model in a component of your model
cmp = ~ Intercept(1) + smooth(covariate, model = spde)

Example

Let’s go back to the Triceps skinfold thickness example.

The Model

\[ \begin{aligned} y_i|\eta_i, \sigma^2 & \sim \mathcal{N}(\eta_i,\sigma^2)\\ \eta_i & = \beta_0 + \omega(\text{age}_i)\\ \omega(\text{age}_i) & \sim \text{GF}(\rho_{\omega},\sigma_{\omega}) \end{aligned} \]

triceps = read_csv("Data/triceps.csv")
triceps[1:3,c(1,3)]
# A tibble: 3 × 2
    age triceps
  <dbl>   <dbl>
1 12.1     3.40
2  9.91    4   
3 10.0     4.20

The code

locs = seq(-5,60,5)
mesh1d = fm_mesh_1d(locs, degree = 2, boundary = "neumann")

spde <- inla.spde2.pcmatern(mesh1d,
  prior.range = c(10, 0.5),
  prior.sigma = c(1, 0.5))

cmp <- ~ Intercept(1) +  field(age, model = spde)
lik_spde = bru_obs(formula = triceps ~.,data = triceps)
fit_spde = bru(cmp, lik_spde)

Comparing SPDE and RW2